122 research outputs found

Author Identifiers in Scholarly Repositories

Author: Warner Simeon
Publication date: 01/05/2009

Bibliometric and usage-based analyses and tools highlight the value of information about scholarship contained within the network of authors, articles and usage data. Less progress has been made on populating and using the author side of this network than the article side, in part because of the difficulty of unambiguously identifying authors. I briefly review a sample of author identifier schemes, and consider use in scholarly repositories. I then describe preliminary work at arXiv to implement public author identifiers, services based on them, and plans to make this information useful beyond the boundaries of arXiv.

Comment: 10 pages. Based on a presentation given at Open Repositories 200

arXiv.org e-Print Archive

Scholarly Materials And Research @ Georgia Tech

CiteSeerX

Journal of Digital Information (Texas Digital Library - TDL E-Journals)

Eprints and the Open Archives Initiative

Author: Warner Simeon
Publication venue: 'Emerald'
Publication date: 03/07/2003

The Open Archives Initiative (OAI) was created as a practical way to promote interoperability between eprint repositories. Although the scope of the OAI has been broadened, eprint repositories still represent a significant fraction of OAI data providers. In this article I present a brief survey of OAI eprint repositories, and of services using metadata harvested from eprint repositories using the OAI protocol for metadata harvesting (OAI-PMH). I then discuss several situations where metadata harvesting may be used to further improve the utility of eprint archives as a component of the scholarly communication infrastructure.

Comment: 13 page

arXiv.org e-Print Archive

Crossref

Exposing and harvesting metadata using the OAI metadata harvesting protocol: A tutorial

Author: Warner Simeon
Publication date: 01/01/2001

In this article I outline the ideas behind the Open Archives Initiative metadata harvesting protocol (OAIMH), and attempt to clarify some common misconceptions. I then consider how the OAIMH protocol can be used to expose and harvest metadata. Perl code examples are given as practical illustration.

Comment: 13 pages, 1 figure. Example programs included (download source). HEPLW version (HTML) available online at http://library.cern.ch/HEPLW/4/papers/3

arXiv.org e-Print Archive

E-LIS

CERN Document Server

Author identifiers: 1) Services at arXiv and 2) ORCID and repositories

Author: Warner Simeon
Publication venue: International Conference on Open Repositories : Proceedings
Publication date: 31/12/2010

I will present two separate but related topics, where experience with the first provides much of my perspective on the second. Public author identifiers and services based on them were introduced in March 2009, and early work and design were reported at OR09. The original services have been running for a year now and additional facilities have been added. I will report on uptake and usage patterns, and describe the more popular services. ORCID is an exciting initiative involving both commercial and academic participants that aims to build a registry and assign identifiers to address the author ambiguity problem. I will report on the current status of this rapidly evolving project and suggest how the repository community may contribute to and benefit from it.

BieColl - Bielefeld eCollections

Author identifiers: 1) Services at arXiv and 2) ORCID and repositories

Author: Warner Simeon
Publication venue: International Conference on Open Repositories : Proceedings
Publication date: 31/12/2010

BieColl - Bielefeld Electronic Collections

BieColl - Bielefeld eCollections

Plagiarism Detection in arXiv

Author: Gehrke Johannes, Ginsparg Paul, Sorokina Daria, Warner Simeon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14-year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.

Comment: Sixth International Conference on Data Mining (ICDM'06), Dec 200

arXiv.org e-Print Archive

CiteSeerX

eCommons@Cornell